ABSTRACT

Indices were constructed to measure individual
differences in the effects of the automated testing format and
repeated testing on Minnesota Multiphasic Personality Inventory
(MMPI) responses. Two types of instability measures were studied
within a data set from the responses of 150 undergraduate students
who took a computer-administered and pencil-and-paper MMPI a week
apart. Two subject groups included 42 males and 33 females each. One
set of indices measured systematic format- and time-related changes
in responding, shifting attributable to format or time alone. Two
families of six indices each were computed measuring unsystematic
changes in responding, or overall tendencies to shift in a particular
direction among the responses "true," "false," and "cannot say."
These unsystematic changes were assessed both between formats and
across times, although they were partially confounded in the present
study. Systematic format shifting was related to a more general and 
unsystematic tendency to shift between "true" and "false" responses. 
The use of "cannot say" in the computerized testing situation appears 
distinct from the tendency to use the "cannot say" response on the 
pencil-and-paper test. Systematic item shifting attributable to time, 
although not involving an internally consistent set of responses, is 
distinct from other instability indices derived in this study.
Personality and other correlates of the item-shifting indices are
discussed. Five tables present study data. (Author/SLD)
Format sensitivity scales
Abstract 

Indices are constructed to measure individual differences in 
the effects of the automated testing format and repeated testing 
on MMPI responses. Two types of instability measure are studied 
within a data set from the responses of 150 undergraduate subjects 
who took a computer-administered and Pencil and paper MMPI a week 
apart. One set of indices measures systematic format- and time- 
related changes in responding, shifting attributable to format or 
time alone. Two families of six indices each are computed 
measuring unsystematic changes in responding, overall tendencies 
to shift in a particular direction between the particular 
responses, "True", "False", and "Cannot Say". These Unsystematic
changes are assessed both between formats and across times,
although these factors are partially confounded in the present
study. Systematic Format shifting is related to a more general,
unsystematic tendency to shift between "True" and "False" 
responses. The use of "Cannot Say" in the computerized testing 
situation appears distinct from the tendency to use "Cannot Say" 
on the Pencil and Paper test. Systematic item shifting 
attributable to Time, although not involving an internally 
consistent set of responses, is distinct from the other 
instability indices derived in this study. Personality and other 
correlates of the item-shifting indices are discussed. 
Indices of individuals' sensitivities to computerized
test administration and repeated testing
This paper investigates individual differences in the 
effects of automated test administration and repeated testing on 
subjects' responses to the Minnesota Multiphasic Personality 
Inventory (MMPI; Hathaway and McKinley, 1967). The effects of 
different assessment formats (e.g., Computer-administered vs.
Pencil and paper), as well as the effects of taking an instrument
more than once, may be the same for everyone who takes a test.
Or, particular people may be sensitive to different
administrations in particular ways. Differences have been found
in test results and client attitudes related to computerized tests
(Ben-Porath and Butcher, 1987), and it is important to evaluate 
format equivalence and the size of mode or format differences when 
using a computerized test (Butcher, 1987; Honaker, 1987) with a 
particular patient. 

Research to date has indicated that the effects of 
computerized testing are relatively small. The early prediction 
of increased candor and less defensiveness in computerized 
assessment has not been borne out in research on objective 
personality tests. A number of studies using MMPI scales have 
employed a variety of designs to test this hypothesis and to 
examine format equivalence in general (Bresolin, 1984; Biskin and 
Kolotkin, 1977; Evan and Miller, 1969; Hart and Goldstein, 1985;
Koson, Kitchen, Kochen and Stodolosky, 1970; Lambert et al., 1987;
Lushene, O'Neill, and Dunn, 1974; Russell, Peace, and Mellsop, 
1986; Schuldberg, in press; White, Clements and Fowler, 1985). 
Overall, computerized administration tends to produce less 
elevated MMPI profiles than traditional procedures, even when test
takers' use of the "Cannot Say" or unscoreable response is 
controlled. These format effects have tended to be small, 
although differences emerge when robust designs are employed. 
Research has almost exclusively dealt with scale rather than item 
equivalence. 

For a number of instruments, "Cannot Say" responses are 
given consistently more often with computer administration, 
unless the testing software is designed to make this response 
more difficult; this accounts for some (but not all) of the early
format differences observed in scale scores (Moreland, 1987).
This differential use of the "Cannot Say" response can be
controlled for most computer-administered tests.^1

It is difficult to disentangle the effects of repeated 
testing from format effects when test-retest designs are used in 
format equivalence research (and most researchers agree that these 
are the designs of choice), unless fairly complex experimental 
designs are used (see Schuldberg, in press). One beneficial 
effect of current research on automated testing is to focus more 
attention on retest effects in objective personality instruments,
issues of person reliability, and more general issues of test 
occasion equivalence. 

Before the era of computerized assessment, format equivalence 
research was concerned with similarities and differences between
the card, booklet, and tape-recorded, as well as various shortened
forms, of the MMPI (see Dahlstrom, Welsh, and Dahlstrom, 1972, pp.
24-28).^2 To date, relatively little research has been done on
individual differences in format effects, their correlates, and
possible underlying psychological processes (Honaker, 1987),
despite the fact that a literature exists on individual
differences in retest effects. The research reported here uses a
variety of techniques to generate measures of both format and 
retest effects. The experimental design employed provided a 
partial separation of retest and format effects. However, this 
design, which uses two groups of subjects receiving two forms of
the test in counterbalanced order, cannot detect "sensitization" 
effects for test format, effects related to which form the subject 
experienced when first exposed to the instrument. In addition, 
retest effects are partially confounded with format effects in 
some of the measures of item response instability. 

Measures of unsystematic instability in responding.

Previous research on the temporal stability of personality
profiles has studied change in either an individual's personality
profile or in items across two separate testing occasions
(Dahlstrom, Welsh, and Dahlstrom, 1975; Fekken and Holden, 1987;
Goldberg, 1978; Goldberg and Jones, 1969; Lewinsohn, 1965;
Mauger, 1972; Mills, 1954; Pepper, 1964; Schofield, 1950; Windle,
1954, 1955). Measurement of change in responding at the item
level is based on the number of items that are changed between
testings.^3 Indices of the total amount of item shifting are
non-directional because they are computed without regard to the
direction of change (e.g., "False-True" vs. "True-False" vs.
"Cannot Say-True", etc.). This paper also refers to such change
indices as tapping Unsystematic instability because they do not
take into account the properties of the particular items (or
scales) that shift for a given subject, or the direction of the
shifts for particular items; the items are not keyed. In
addition, if a subject is given two different forms of a test on 
two different occasions, an unsystematic and non-directional 
instability index counting total number of changed items combines 
the shifting attributable to repeated testing and to changes in 
the test format. 

The directional unsystematic shifts in which the subject 
answered "Cannot Say" in the computerized administration and gave 
a scoreable "True" or "False" response in the Pencil and paper
condition are especially interesting, due to the increased use of
the "Cannot Say" response on computerized instruments such as the
present one where this response is not limited by the
administration software. 

Test-retest effects, inconsistent in pattern across studies
although sometimes significant in magnitude, have generally been
minimized in work with the MMPI. However, an
important aspect of research on changes in responding across time 
involves treating profile or item instability as an individual 
difference variable and deriving MMPI indices of person 
reliability. When profile or item instability is treated as a 
trait, the investigator can then use empirical scale development 
techniques to derive items that predict test takers' individual 
levels of the trait. Such a scale may exclude the unstable items 
themselves (e.g. Pepper, 1964). 

The present paper separates Unsystematic instability
into its directional components and computes twelve indices of
Unsystematic item instability. These indices count test takers'
shifts in item responding between "True", "False", and "Cannot
Say", in each of six possible orders, tallied between formats or
across times. 

Indices of Systematic profile instability. 

The derivation of Systematic instability indices differs from 
methods for constructing overall or Unsystematic instability
scales. Previous research using the same subject pool
(Schuldberg, 1987; in press) derived indices of systematic
response shifting occurring either between test formats or across
repeated testings. A set of forty MMPI items showed significant
effects for format of administration but not for repeated testing
in item-level crossover analyses of variance (Edwards, 1968;
Winer, 1962). These items provide a measure of systematic scale
instability because, as a group, subjects' responses to them tend
to be different in a particular direction between two testing
formats but not across two times. Briefly, these items tend to be 
scored in opposite directions from and tend to be negatively 
correlated with the MMPI scales (a notable exception being the K 
scale), another finding counter to the hypothesis of increased 
candor in the computerized testing format. A larger number of 
items (seventy-five) showed significant effects for repeated 
testing alone. These seventy-five items are used as an index of 
Systematic time instability.^4

It was hypothesized that four distinct types of shifting 
would emerge with different correlates on the MMPI and other
measures: 1) Shifting to "Cannot Say" on the computerized
administration; 2) Systematic shifting between "True" and "False"
responses between test formats; 3) Systematic shifting between
"True" and "False" responses across times; and 4) A general index
of carelessness reflected in shifts between "True" and "False" 
both across occasions and between formats. 

Methods 

Subjects and procedures 

The subjects in this research were drawn from a pool of 
students from an Introductory Psychology class who signed up for 
a personality assessment study in partial fulfillment of a course 
experimental requirement. Subjects responding with thirty or 
more unscoreable or "Cannot Say" responses to either form of the 
test were eliminated. Additional subjects were dropped randomly 
from the smaller group in order to create two groups of seventy- 
five subjects each, matched for number of males and females. This 
resulted in a sample of 150 students, in two groups each composed
of 42 males and 33 females.

Subjects were tested twice, in counterbalanced order, with 
two forms of the test given approximately a week apart. The 
Pencil and paper version of the MMPI presented the group form 
items in test booklet form. Subjects were tested in a classroom,
in groups of up to twenty-five, and were also given the Shipley
Institute of Living Scale (Zachary, 1986) after the MMPI. The
automated MMPI was administered to groups of students on Apple
IIe computers in a microcomputer teaching lab containing 25
machines.^5 After the students' final test administration (either
Pencil and paper or Computer), they were given a brief 
questionnaire about their experience with computers, with 
objective personality tests, and their perception of the degree 
to which the two personality inventories were the same.

Construction of instability indices 
Systematic item instability.

An item was selected for one of the two indices of 
Systematic instability if it showed a relatively unambiguous 
format or time effect in an item-level crossover ANOVA 
(Schuldberg, in press). An item with unambiguous format effects 
showed a significant main effect for format alone (not for time 
or group). An item showing an unambiguous effect for time was one 
with a significant main effect for time, but not for format or
group. These items were keyed to indicate the direction that,
overall, the group of 150 test takers shifted on the item either
from the first to the second administration or between the
Computer and the Pencil and paper administrations. Only shifts
involving "True" and "False" responses are included in the two 
Systematic shifting indices. 

For each subject, the number of shifts in the keyed direction 
for both the format-sensitive and the time-sensitive items was 
tallied. Each shift in the keyed direction on a selected item was 
given a weight of +1 ; each shift opposite to the keyed direction 
was given a weight of -1 ; items that did not shift were not 
counted. This resulted in a score for each subject on two scales: 
Systematic Format sensitivity and Systematic Time sensitivity. 
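
The weighting rule just described can be sketched in a few lines of Python. This is an illustrative sketch only, not the scoring program used in the study: the function name, the dictionary layout, and the example responses and key are all hypothetical. For the Time index the two arguments are the first and second administrations; for the Format index some fixed orientation between the two formats has to be assumed.

```python
def systematic_score(before, after, key):
    """Sum +1 / -1 weights over keyed items for one subject.

    before, after: dict of item number -> "T", "F", or "?" (Cannot Say).
    key: dict of item number -> "T" or "F", the response toward which
         the group as a whole shifted (hypothetical key in this sketch).
    """
    score = 0
    for item, keyed in key.items():
        a, b = before[item], after[item]
        if a == b or "?" in (a, b):
            continue  # unchanged items and "Cannot Say" shifts are not counted
        score += 1 if b == keyed else -1  # shift with (+1) or against (-1) the key
    return score


# Hypothetical key and responses for five items.
key = {3: "T", 7: "T", 8: "T", 9: "T", 16: "F"}
first = {3: "F", 7: "T", 8: "F", 9: "?", 16: "F"}
second = {3: "T", 7: "T", 8: "T", 9: "T", 16: "T"}

print(systematic_score(first, second, key))  # 1
```

Items 3 and 8 shift in the keyed direction, item 16 shifts against it, item 7 is unchanged, and item 9 involves "Cannot Say", so the score is +1 +1 -1 = 1.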

The internal consistency of the two measures of Systematic 
instability is low (for Systematic Format instability, Cronbach's 
alpha = 0.25; for Systematic Time instability, alpha = 0.21),
indicating that although these items shift significantly in a 
particular direction between formats or across times for the 
subjects as a group, particular subjects were not consistent in
their shifting on these items.
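
For readers unfamiliar with the statistic, Cronbach's alpha for a set of k items is k/(k-1) times one minus the ratio of the summed item variances to the variance of the total score. The sketch below uses only the Python standard library and assumes each subject's item score is the -1, 0, or +1 weight described above; the data are hypothetical and chosen so that the items agree perfectly.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a subjects-by-items table of scores.

    rows: list of per-subject lists, one score per item.
    """
    k = len(rows[0])
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical shift weights (-1, 0, +1) for three subjects on three items.
consistent = [[1, 1, 1], [0, 0, 0], [-1, -1, -1]]
alpha = cronbach_alpha(consistent)
```

When every subject shifts the same way on every item, as in this toy table, alpha equals 1; values such as the 0.25 and 0.21 reported above indicate that subjects' shifts on the keyed items hang together only weakly.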

Unsystematic item instability.

Two families of measures of raw, unsystematic item shifting
were constructed. These measures are counts of the number of
items for which the subject changed his or her response between
administrations. Given that a subject answers an item as "True",
"False", or "Cannot Say" on two separate occasions, nine 
sets of responses are possible (see Table 1); six of these 
represent response shifts. Although order of administration and 
format are partially confounded in the present design, item shifts 
can be counted separately for shifts between formats and shifts 
across occasions. 
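
The nine response combinations and the six that count as shifts can be enumerated directly. The short Python sketch below is only an illustration; the labels "T", "F", and "?" follow the abbreviations used in the tables.

```python
from itertools import product

RESPONSES = ("T", "F", "?")  # True, False, Cannot Say

# Every ordered pair of responses across two administrations.
pairs = [f"{a}-{b}" for a, b in product(RESPONSES, repeat=2)]

# A pair is a shift when the two responses differ, stable otherwise.
shifts = [p for p in pairs if p[0] != p[-1]]
stable = [p for p in pairs if p[0] == p[-1]]

print(shifts)  # ['T-F', 'T-?', 'F-T', 'F-?', '?-T', '?-F']
print(stable)  # ['T-T', 'F-F', '?-?']
```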



Insert Table 1 about here 



A tally of each of the six types of item shift between 
Computer and Pencil and paper conditions across all 566 items was 
made for each subject, as well as a total Unsystematic instability 
score (the sum of the six instability indicators).^6 This
essentially follows one of Pepper's (1964) strategies for
assessing change in responding, although the present study differs
from Pepper's in examining specific kinds or directions of item
shift (e.g., from "True" to "False").^7 In the same way, six
measures were computed for the shifting of responses to the 566
MMPI items across the first and second occasions of testing. It
would be possible to compute an alpha to assess the internal
consistency of each Unsystematic index; however, since each index
contains 566 items, this task was beyond the computational
capacity available.
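
The two families of tallies can be sketched as follows. This is a hypothetical illustration, not the original tallying program: it assumes Format shifts are counted from the Computer response to the Pencil and paper response, and Time shifts from the first to the second administration. The five-item response lists are invented.

```python
from collections import Counter


def tally(before, after):
    """Count each directional shift (e.g. 'T-F') over paired item responses."""
    return Counter(f"{a}-{b}" for a, b in zip(before, after) if a != b)


def format_and_time_tallies(computer, paper, computer_first):
    """Return (format tally, time tally) for one subject."""
    fmt = tally(computer, paper)  # assumed orientation: Computer -> Paper
    first, second = (computer, paper) if computer_first else (paper, computer)
    return fmt, tally(first, second)


# Hypothetical subject who took the Pencil and paper form first.
computer = ["T", "?", "F", "T", "F"]
paper = ["F", "T", "F", "T", "?"]
fmt, time = format_and_time_tallies(computer, paper, computer_first=False)

print(dict(fmt))   # {'T-F': 1, '?-T': 1, 'F-?': 1}
print(dict(time))  # {'F-T': 1, 'T-?': 1, '?-F': 1}
```

For a subject tested on the computer first the two tallies coincide, and for a subject tested on paper first each Time shift is the reverse of a Format shift. This is one way to see why the Format and Time families are partially confounded, and why the total number of shifted items is the same whichever family is summed.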

Construction of Unsystematic "True-False" indices free of
overlapping items for the factor analyses

As described above, the indices of Systematic shifting due 
to either time or format measure shifting in a keyed direction 
across times or between formats involving "True" and "False"
responses. The items on these indices are all included on the
Unsystematic "True-False" and "False-True" shifting indices
tallied between formats or across times for all 566 items. For
the factor analytic studies, special indices of Unsystematic item
shifting between "True" and "False" (and "False" and "True")
responses between formats and across times were constructed that
excluded the items on the Systematic format or Systematic time
indices. While these non-overlapping Unsystematic indices are
highly correlated with the basic indices computed over all 566 
items, they are used in the factor analyses in order to reduce 
spurious correlation among the variables introduced by item 
overlap. 
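
A non-overlapping index of this kind amounts to the same count with the Systematic items left out. The sketch below is a hypothetical illustration; the function name, item numbers, and responses are invented.

```python
def shift_count(before, after, shift, exclude=frozenset()):
    """Count items showing one directional shift, skipping excluded items.

    before, after: dict of item number -> "T", "F", or "?".
    shift: label such as "T-F" (response before, then response after).
    exclude: item numbers to leave out, e.g. the Systematic index items.
    """
    old, new = shift.split("-")
    return sum(
        1
        for item in before
        if item not in exclude and before[item] == old and after[item] == new
    )


# Hypothetical responses to four items; item 2 is on a Systematic index.
before = {1: "T", 2: "T", 3: "F", 4: "T"}
after = {1: "F", 2: "F", 3: "T", 4: "T"}
systematic_items = {2}

full = shift_count(before, after, "T-F")
reduced = shift_count(before, after, "T-F", exclude=systematic_items)
print(full, reduced)  # 2 1
```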

Other measures 

Relationships between the indices of Systematic and 
Unsystematic item instability and profile validity (assessed by 
the four MMPI validity scales), as well as subjects' personality
characteristics (measured by the MMPI clinical scales), scores on
the Shipley Institute of Living Scale (Zachary, 1986), age, and
computer experience are examined.

Computer and related experience was measured using responses
to three seven-point Likert-type items: "How much experience with
computers have you had before this experiment?" (rated from "no
experience" to "a great deal of experience"), "How often do you
type or use a typewriter keyboard (such as on a computer
terminal)?" (rated from "never" to "every day"), and "How often do
you play with video games or computer games?" (also rated from
"never" to "every day"). The average of the responses to these
three items served as an index of computer and related experience.
Cronbach's alpha for this three-item scale is 0.61.

Analyses 

Correlates of each of the two Systematic instability indices, 
the twelve Format and Time Unsystematic instability indices, and 
the total Unsystematic instability index are examined and 
discussed. Two Principal-components factor analyses with Varimax 
rotation are used to examine the factors underlying the item 
instability indices. Since the Unsystematic indices of item
shifting confound the effects of time and administration format,
separate analyses are conducted for the Format and Time indices,
in each case including the Systematic Instability index and the
six format or time Unsystematic instability indices.
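
As a rough illustration of this kind of analysis, the sketch below extracts principal-component loadings from a correlation matrix and applies a standard varimax rotation. It is not the software used in the study (which is not named here); it assumes NumPy is available, and the 150-by-7 data matrix is random, hypothetical data standing in for one Systematic index plus six Unsystematic indices.

```python
import numpy as np


def principal_component_loadings(data, n_factors):
    """Loadings of the first n_factors principal components of the correlations."""
    corr = np.corrcoef(data, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1][:n_factors]  # largest eigenvalues first
    loadings = eigvecs[:, order] * np.sqrt(eigvals[order])
    explained = eigvals[order].sum() / corr.shape[0]  # share of total variance
    return loadings, explained


def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonal varimax rotation of a loading matrix."""
    p, k = loadings.shape
    rotation = np.eye(k)
    previous = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        target = rotated**3 - rotated @ np.diag((rotated**2).sum(axis=0)) / p
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt  # nearest orthogonal matrix to the gradient direction
        if previous and s.sum() < previous * (1 + tol):
            break
        previous = s.sum()
    return loadings @ rotation


rng = np.random.default_rng(0)
data = rng.normal(size=(150, 7))  # hypothetical: 150 subjects, 7 indices
loadings, explained = principal_component_loadings(data, n_factors=3)
rotated = varimax(loadings)
```

Because the rotation is orthogonal, each variable's communality (the sum of its squared loadings) and the total variance accounted for by the three factors are unchanged; only the distribution of loadings across factors changes.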

Results 

Subjects shifted most often between "True" and "False" 
responses (or vice-versa), changing an average of 43.8 items in 
either direction. On average, the test takers made shifts of any 
kind in their responses to 94.1 items, higher than the value of
67.0 reported by Fekken and Holden (1987) for shifting across
repeated testings alone. The Total Unsystematic shifting index
ranged from 32 to 249 items. 

Table 2 presents the correlations of the Format instability 
indicators, both Systematic and Unsystematic, with the MMPI 
scales and total Shipley score. The Systematic Format 
instability index is significantly negatively correlated only 
with MMPI Scale 3 (Hysteria) and the Shipley score, as well as 
showing weak positive correlations with several MMPI scales 
reflecting deviant responses. The Unsystematic index of shifting
from "True" in the Computerized administration to "False" in the
Pencil and paper condition is positively correlated with F and 
seven of the clinical scales, as well as being negatively 
correlated with L, K, Scale 3 again, and the Shipley score.
Correlations for the Unsystematic shifting indices involving
"Cannot Say" are mainly related to "Cannot Say" scores
themselves, with small negative correlations occurring for the K 
scale (for "Cannot Say-True" shifts) and Shipley score. The 
total Unsystematic shifting index (consisting of the sum of all 
item shifts between administrations) is positively correlated 
with "Cannot Say" and F, negatively correlated with K, a finding 
consistent with other research (Fekken and Holden, 1987; Windle, 
1954), highly negatively correlated with the Shipley score, and 
positively correlated with four clinical scales. 

A similar pattern of results emerges for the Unsystematic 
indices of item shifting across times (See Table 3). The 
Systematic Time Instability index, computed on the basis of keyed
item shifting on the seventy-five items showing significant
effects for Time in earlier crossover ANOVAs, is negatively
correlated with L (r = -0.30, p < 0.001). Unsystematic "True-False"
shifting from first to second administration is positively
correlated with F and four clinical scales, and negatively 
correlated with the K scale and total Shipley score. Shifting 
involving "Cannot Say" is, of course, related to raw "Cannot Say" 
scores, as well as showing small correlations in different 
directions on two clinical scales. None of the instability
indices is correlated with age or the measure of computer and
related experience.

Factor analyses were conducted to examine the relationships 
among various instability indices. The first factor analysis 
includes the Format Instability indicators and the second the Time 
Instability Indices; these analyses were conducted separately, due 
to the fact that the Unsystematic indices of Time and Format 
shifting are confounded. In each factor analysis, the special 
indices of Unsystematic "True-False" and "False-True" shifting 
described above are used; the items keyed "True-False" or "False- 
True" on the Systematic shifting index entering in the same 
analysis are excluded. This eliminates any relationships among 
the variables in a given analysis solely due to common items. A 
three-factor solution was derived in each analysis. 

The first factor analysis included the Systematic Format 
shifting index and the six indices of Unsystematic shifting across
Formats. Three factors accounted for 78% of the variance in these
measures. The first factor includes the measures that refer to
use of the "Cannot Say" response in the Pencil and paper Format.
The second factor contains the two indices involving the "Cannot
Say" response in the computerized test format. The third factor
contains the two Unsystematic indices of shifting between "True"
and "False" responses between formats (both "True-False" and
"False-True"), as well as the Systematic Format Index, which also 
refers to shifting between "True" and "False" and vice versa. 

Similar results are obtained in the second factor analysis, 
which includes the Systematic and Unsystematic Time indices. 
Three factors accounted for 74% of the variance in this analysis. 
Again, two "Cannot Say" factors emerge (one for Time 1 and one for 
Time 2), along with a factor on which the Unsystematic "True- 
False" shifting indices load. Unlike the Systematic Format 
Instability index, the Systematic Time Instability index shares 
virtually no common variance with the three factors.

Discussion 

With regard to the specific hypotheses of this study, it 
does appear that use of the "Cannot Say" response is different
under the two test formats. However, there do not appear to be
several distinct varieties of "True-False" shifting. This type
of response alternation emerges as a unitary phenomenon,
regardless of the direction of the shift or whether the
particular items involved are "format sensitive" or not. This
"True-False" shifting appears to be related to a more deviant (or
careless) response set, and co-occurs with more elevated overall 
profiles. The Systematic Time shifting scale, on the other hand, 
is unrelated to the other indices in this study. 

The measures of Systematic instability due to Format or Time 
have low alphas, indicating that test takers do not shift their
responses to a particular consistent set of MMPI items, either 
between administration formats or across occasions. Neither index 
taps a consistent pattern of individual differences in response 
shifting. The systematic tendency to shift items between "True" 
and "False" across time was unrelated to the other instability 
measures, and may possibly be related to a general tendency to 
respond to the test with candor; the Systematic Time shifting 
indicator is negatively correlated with the MMPI "Lie" scale. 
Although research has suggested that response shifting on retest 
may occur in the direction of increased Social Desirability 
(Windle, 1954), the present finding indicates that the particular 
subjects who shift in this way may be more candid ones. 

This study finds at least three different types of 
instability in item responding. The first is a general tendency 
for some test takers to shift between "True" and "False" responses 
when taking the MMPI twice. It is unclear whether this shifting 
is primarily attributable to time or format, although it appears 
doubtful on the basis of previous research that some test takers 
shift between "True" and "False" responses on the basis of the
format of the test alone; when retest involves a different test
format, however, the average number of items changed is greater
than reported elsewhere in the literature. The general tendency
to shift between "True" and "False" responses to the test may
reflect invalid or "unreliable" responding, as it is associated 
with elevated scores on F and more generally elevated profiles. 
The negative correlations between the "True-False" shifting 
indices and the Shipley score may indicate that this represents 
an "intellectually easy" approach to the test or reflects a 
cognitive deficit. However, these negative correlations may 
also indicate that test takers who made a large number of "True- 
False" shifts (or vice versa) were poorly motivated in the 
testing situation and responded to the Shipley and at least one 
of the MMPIs in a haphazard fashion.

The use of the "Cannot Say" response contains two underlying
factors, related either to Format or Time of administration.^8 The
two factor analyses taken together tend to muddy the effects of
Format and Time on "Cannot Say" scores. However, as a group, the 
test takers' use of the "Cannot Say" response shows a highly 
significant effect for Format and not for Time (Schuldberg, in 
press); this indicates that the two factors underlying the "Cannot 
Say" response are "Computer format" and "Pencil and paper format", 
as emerged in the first factor analysis. The two "Cannot Say" 
factors from the second factor analysis appear to be an artifact 
of the confounding of format and time effects. The main finding 
of the second factor analysis is that the Systematic Time 
shifting index, computed from each subject's keyed shifting on 
the seventy-five items showing significant time effects, is 
unrelated to the other indices of response variability in this 
study. Although this Systematic Time index has low internal
consistency, it taps a distinct domain of response tendencies not 
measured by the other indices in this study. The present 
research suggests the importance of continued awareness and 
renewed examination of the effects of repeated testing on 
subjects' MMPI responses.

In conclusion, the shifting of item responses between two 
different test formats appears to be an inconsistent although 
measurable phenomenon, mainly reducible to "True-False" shifting 
and to differential use of "Cannot Say" in some computerized 
tests. The "Cannot Say" response appears to represent a separate 
phenomenon in computerized and traditional MMPI administration, 
and is sensitive to the design of the administration software. 
Systematic Time shifting is a more substantial and robust effect 
than Format shifting; however, its relationship to other person 
variables remains something of a mystery. 
Footnotes

^1 However, responding to certain types of tests may be
fundamentally changed by the automated format. For example, the
Adjective Check List may become more like a forced choice
instrument involving paired comparisons for each adjective in the
automated format, resulting in very different patterns of
responding (see Harris and Allred, 1987).

^2 This research turned up significant but inconsistent patterns of
format differences.

^3 Items that recur within the test can also be examined for
consistency within a single occasion of testing.

^4 Items showing effects for group or effects for a combination of
time, group, and format were discarded as exhibiting ambiguous
and uninterpretable effects. The composition of the Systematic
Time sensitivity index is as follows: Items keyed True (n =
55): 3, 7, 8, 9, 36, 46, 68, 73, 79, 88, 96, 98, 99, 118, 119,
128, 130, 131, 133, 146, 155, 160, 164, 175, 188, 221, 222, 228,
229, 230, 254, 261, 264, 274, 277, 302, 304, 306, 307, 318, 346,
369, 372, 376, 377, 379, 381, 428, 434, 435, 495, 522, 523, 537,
542. Items keyed False (n = 20): 16, 22, 62, 71, 84, 90, 93,
134, 215, 217, 297, 313, 332, 334, 390, 397, 436, 497, 543, 560.
(Numbers refer to the booklet form of the MMPI.) The keying of
these items is based on the direction of group changes from the
first to the second administration. An item is keyed "True" when
it was answered significantly more often as "False" on the first
administration and "True" on the second. An item keyed "False"
was answered significantly more often as "True" on the first
administration and "False" on the second.

^5 The items were presented in Form R order in the computer
administration condition and in booklet form order in the Pencil
and paper condition, representing a confound of test format and
order of the items in the present study.

^6 Only one total Unsystematic item instability score was computed
because the overall, total Unsystematic Instability measure
collapses Format and Time effects; the value of the total
Unsystematic instability measure is identical whether it is
computed using the individual Format or Time indices.

^7 In contrast to Pepper's (1964) work, the present research does
not construct a separate scale for predicting subjects' scores on
the shifting indices.

^8 These findings may be compared to Edwards and Walsh's (1964)
analysis of "Cannot Say" scores. Examining responses from Pencil
and paper tests, these authors found three interpretable factors
underlying "Cannot Say" responses, in particular that "Cannot Say"
responses are different for True-False and forced-choice items. A
similar distinction may be useful in classifying different types of
automated assessment formats (see footnote 1).
Table 1. 

Systematic and Unsystematic Instability Indices 

Systematic Instability Indices 

  Format:  40 items showing significant response differences in 
           a particular direction attributable to format. 

  Time:    75 items showing significant response differences in a 
           particular direction attributable to time. 

Unsystematic Instability Indices: item shifts tallied between 
Formats and across Times 

  T-F      T-? 
  F-T      F-? 
  ?-T      ?-F 

  (These shifts are assessed both between formats and across times.) 

Stable response combinations: 

  T-T      F-F      ?-? 
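
The tallies named in this table can be sketched in a few lines. This is a minimal illustration that assumes each administration is stored as a mapping from item number to "T", "F", or "?". The function name and the five-item toy data are hypothetical and do not come from the study.

```python
from collections import Counter

RESPONSES = ("T", "F", "?")

def tally_shifts(first, second):
    """Count every first-second response combination across items.

    Returns (shifts, stable): the six shift tallies (for example "T-F")
    and the three stable combinations (for example "T-T").
    """
    counts = Counter()
    for item, a in first.items():
        b = second[item]
        counts[f"{a}-{b}"] += 1
    shifts = {f"{a}-{b}": counts[f"{a}-{b}"]
              for a in RESPONSES for b in RESPONSES if a != b}
    stable = {f"{a}-{a}": counts[f"{a}-{a}"] for a in RESPONSES}
    return shifts, stable

# Invented responses for five items, for illustration only.
first = {1: "T", 2: "T", 3: "F", 4: "?", 5: "F"}
second = {1: "T", 2: "F", 3: "?", 4: "T", 5: "F"}
shifts, stable = tally_shifts(first, second)
print(shifts)
print(stable)
```

Passing the computer and Pencil and paper responses gives a between-format tally, and passing the first and second administrations gives an across-time tally.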
Table 2. 

Correlation of Format instability scales with MMPI scales and Shipley 

                                        Instability Scale 

             Systematic   Unsystematic Format Instability                             Total 
             Format                                                                   Unsystematic 
Measure      Instability  T-F       T-?       F-T       F-?       ?-T       ?-F       Instability^b 

MMPI scales 
?           -.13         -.06       .11       .08       .16       .89***    .92***    .22** 
L            .09         -.20*     -.05       .04       .01      -.04       .08      -.09 
F            .15          .42***    .04       .16       .05       .10       .01       .35*** 
K           -.10         -.50***   -.01      -.14       .01      -.17*     -.08      -.40*** 
1            .03          .26***    .12       .24***    .11       .07       .03       .31*** 
2           -.03          .06      -.01       .03       .10      -.08      -.09       .04 
3           -.18*        -.22**     .11       .08       .06      -.00      -.05      -.09 
4           -.00          .19**     .14       .10       .12      -.02       .00       .18* 
5           -.03          .08       .12      -.04       .04      -.11      -.16       .00 
6            .01          .17*      .01       .01      -.04      -.04      -.07       .09 
7            .06          .47***    .07       .13       .02       .08      -.01       .36*** 
8            .16          .49***    .07       .17*      .02       .11       .03       .41*** 
9            .15          .4[?]***  .02       .08      -.07       .00       .02       .29*** 
0            .12          .19*     -.03       .09       .04      -.04      -.10       .15 
Shipley^a   -.17*        -.41***   -.15      -.43***   -.02      -.14      -.17*     -.52*** 

* p < .05   ** p < .01   *** p < .005   Two-tailed tests of significance. 

Note: MMPI scale scores are taken from the computerized administration of the MMPI. 
Results were similar for scores obtained from the Pencil and paper version of the 
MMPI. 

^a Total raw score on the Shipley Institute of Living Scale. 

^b Because format and time of administration are not independent factors 
in this study, the total Unsystematic instability index, computed as the sum of all 
six Unsystematic instability indices, is the same regardless of whether the 
individual format or time measures are used. 
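
The significance markers in the table above can be related to the size of a correlation with a short sketch. It assumes that all 150 subjects contribute to every correlation and uses Fisher's z approximation. The source does not state which test the authors applied, and the function names are illustrative.

```python
import math

def two_tailed_p(r, n):
    """Approximate two-tailed p-value for a Pearson r via Fisher's z."""
    z = math.atanh(r) * math.sqrt(n - 3)
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for a standard normal.
    return math.erfc(abs(z) / math.sqrt(2))

def stars(r, n=150):
    """Return the table's marker for r; n = 150 is an assumption."""
    p = two_tailed_p(r, n)
    if p < 0.005:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    return ""

for r in (0.10, 0.17, 0.22, 0.30):
    print(f"r = {r:.2f}: p = {two_tailed_p(r, 150):.4f} {stars(r)}")
```

The thresholds in `stars` follow the table's own legend (.05, .01, .005). An exact t test on the same correlations would give slightly different p-values near the cutoffs.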
Table 3. 

Correlation of Time instability scales with MMPI and Shipley 

                                  Instability Scale 

             Systematic   Unsystematic Time Instability 
             Time 
Measure      Instability  T-F       T-?       F-T       F-?       ?-T       ?-F 

MMPI scales 
?           [row illegible in the source copy] 
L           [row illegible in the source copy] 
F           [row illegible in the source copy] 
K           [row illegible in the source copy] 
1           [row illegible in the source copy] 
2           [row illegible in the source copy] 
3           [row illegible in the source copy] 
4           [row illegible in the source copy] 
5           -.12          .01      -.04       .04      -.08      -.02      -.09 
6            .01          .06      -.17*      .13      -.10       .11       .00 
7            .08          .26***    .00       [illeg.] -.11       .14       .12 
8            .09          .26***    .04       .41***   -.08       .14       .14 
9            .05          .24***   -.01       .26***   -.00       .02      -.01 
0            .02          .11      -.10       .17*     -.17*      .03       .08 
Shipley^a   -.08         -.43***   -.13      -.42***   -.05      -.13      -.17* 

* p < .05   ** p < .01   *** p < .005   Two-tailed tests of significance. 

Note: MMPI scale scores are based on the computerized administration of the MMPI. 
Results were similar for scores obtained from the Pencil and paper version of the 
MMPI. 

^a Total raw score on the Shipley Institute of Living Scale. 
Table 4. 

Factor Analysis of the item instability measures: Format 

                          Factor 1       Factor 2       Factor 3 
Response                  "Cannot Say:   "Cannot Say:   "TF/FT 
Measure                   P and P"       Computer"      Shiftiness"    Communality 

Systematic Shift 
  Format                  -0.08          -0.13           0.69           0.50 

Unsystematic shifts 
(Computer to Pencil and Paper): 
  T-F^a                   -0.10           [illegible]    [illegible]    [illegible] 
  T-?                      0.99          -0.01          -0.05           0.98 
  F-T^a                   -0.06           0.04           0.79           0.63 
  F-?                      0.99          -0.00          -0.07           0.98 
  ?-T                     -0.00           0.93           0.05           0.88 
  ?-F                     -0.01           0.93          -0.07           0.87 

Percent variance           0.28           0.25           0.25           Total 
accounted for                                                           0.78 

^a These Unsystematic T-F and F-T indicators exclude the MMPI items 
on the Systematic Format shifting index. 

Note: Loadings for the variables used in naming the factor are underlined. 
Table 5. 

Factor Analysis of the item instability measures: Time 

                          Factor 1       Factor 2       Factor 3 
Response                  "Cannot Say:   "Cannot Say:   "TF/FT 
Measure                   Time 1"        Time 2"        Shiftiness"    Communality 

Systematic Shift 
  Time                    -0.26          -0.30          -0.04           0.16 

Unsystematic Shifts 
(First to second administration): 
  T-F^a                   -0.06          -0.10           0.84           0.71 
  T-?                     -0.04           0.93          -0.05           0.86 
  F-T^a                    0.02           0.00           0.86           0.74 
  F-?                     -0.04           0.93          -0.03           0.87 
  ?-T                      0.96          -0.03           0.03           0.93 
  ?-F                      0.96          -0.03          -0.05           0.92 

Percent variance           0.27           0.26           0.21           Total 
accounted for                                                           0.74 

^a The Unsystematic T-F and F-T indicators exclude the MMPI items on 
the Systematic Time shifting index. 

Note: Loadings for the variables used in naming the factor are underlined. 
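
As a consistency check on the figures in this table, the sketch below recomputes each communality as the sum of squared loadings, and each factor's share of variance as its summed squared loadings divided by the number of measures. These are the standard definitions for an orthogonal solution. The source does not describe the extraction or rotation method, so their use here is an assumption. The row labels follow the order of the six shift indices used elsewhere in the paper.

```python
# Loadings (Factor 1, Factor 2, Factor 3) and reported communalities,
# copied from the Time factor analysis table.
table5 = {
    "Systematic Time": ((-0.26, -0.30, -0.04), 0.16),
    "T-F": ((-0.06, -0.10, 0.84), 0.71),
    "T-?": ((-0.04, 0.93, -0.05), 0.86),
    "F-T": ((0.02, 0.00, 0.86), 0.74),
    "F-?": ((-0.04, 0.93, -0.03), 0.87),
    "?-T": ((0.96, -0.03, 0.03), 0.93),
    "?-F": ((0.96, -0.03, -0.05), 0.92),
}

def communality(loadings):
    """Sum of squared loadings for one measure."""
    return sum(x * x for x in loadings)

def variance_proportions(table, n_factors=3):
    """Share of total variance per factor: column sum of squares / measures."""
    n_measures = len(table)
    return [
        sum(loadings[k] ** 2 for loadings, _ in table.values()) / n_measures
        for k in range(n_factors)
    ]

for name, (loadings, reported) in table5.items():
    print(f"{name:16s} computed {communality(loadings):.3f}  reported {reported:.2f}")

proportions = variance_proportions(table5)
print([round(p, 2) for p in proportions], round(sum(proportions), 2))
```

Because the published loadings are rounded to two decimals, the recomputed communalities can differ from the reported ones by about .01.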